home *** CD-ROM | disk | FTP | other *** search
- Arasan - Chess for Windows - version 1.2
- Programmer's Guide
-
- by Jon Dart
-
- Arasan is a chess program for Windows 3.1. It is copyrighted
- but under terms that allow free distribution for non-commercial
- purposes.
-
- This archive contains the source code for Arasan. I am providing
- the source code for the use of programmers who may wish to
- understand how the program works or to modify it for their own use.
- However, you may not distribute a modified version of this program
- without the consent of the original author.
-
- Arasan is written in C++ and was compiled with the Borland C++
- compiler, version 4.0 (the source can also be compiled under
- Borland C++ version 3.1). Arasan uses the Windows++ class library
- by Paul DiLascia. Source for Windows++ is not included in this
- distribution. It is available for $25 from:
-
- Paul DiLascia
- P.O. Box 391632
- Cambridge, MA 02139
-
- Note: I have no connection with Paul DiLascia. I am just a user of
- his library. In a future version of Arasan I hope to convert to using
- a different, more widely available class library such as MFC or OWL.
- I have made a few bug fixes in the Windows++ source code; see the
- file WPP.DIF for details of what was changed.
-
- The remainder of this file contains information for use by programmers
- reading or working on Arasan source code. I assume that you have a
- working familiarity with C++, Windows programming, and the Borland
- compiler and tools.
-
- Building Arasan
-
- I have attempted to make the source code for Arasan as portable as
- possible, but I don't guarantee that it can be built with
- any compiler except Borland C++. In particular, since it uses
- templates, it cannot be compiled with Microsoft Visual C++ 1.0
- or 1.5.
-
- To compile Arasan, first edit the "make.cfg" file so that BCCDIR
- points to your installation of Borland C++, and WPPDIR points to your
- installation of Windows++. If you are using Borland C++ 3.1, you
- will need to remove the -x- switch from the CFLAGS variable defined
- in this file.
-
- Then just type "make" from the source directory. Make will compile
- and link the executable (arasan.exe). I recommend that you have a
- minimum of 4 Megabytes of memory in order to build the source. If
- you are compiling in a DOS box and run out of memory, try exiting
- Windows and running "make" from DOS.
-
- Opening book format
-
- The opening book for Arasan is composed of two binary files, one for the
- White moves ("BOOKW") and one for the black moves ("BOOKB"). At build
- time, these files are bound into the executable by the resource compiler.
- The format of the binary files is documented in makebook.cpp.
-
- Since BOOKB and BOOKW are supplied with the source, you don't need to
- create these files in order to build Arasan. However, if you want to
- modify the contents of the opening book, you will need to edit the ASCII
- source for the openings and rebuild BOOKB and BOOKW. Following is
- information on how to do this.
-
- The ASCII source for the opening book is contained in the file BOOK.TXT.
- Moves are listed one or more per line in standard algebraic notation.
-
- A move may optionally be followed by a number from 0 to 9. The
- number, if given, is the "weight" assigned to the move. A weight of
- 0 means that the computer will never play the move, but will respond
- to it if it is played by its opponent. However, if a weight of 0 is
- specified after a non-zero weight for the same move in the same
- position has already been specified, the original weight is
- unchanged.
-
- The higher the weight, the more often the computer will choose the
- move compared to the other alternatives in the book. By default, if
- no weight is specified, a weight of 5 is assigned. The supplied book
- uses weights rather sparingly. In general, the computer has been
- steered away from gambits, and away from openings it doesn't play
- well. It is also biased towards playing the main line of an opening
- as opposed to less common variants. Note: I am not a strong player,
- and so the opening book is probably far from optimal. It is also not
- very extensive: currently it contains aroud 10000 half-moves, mostly
- from recent grandmaster games. Suggestions for expansion and/or
- improvement of the opening book are quite welcome.
-
- Some special symbols are allowed in the opening book. Any line
- beginning with ";" is treated as a comment and ignored. An "m" in
- the first column means that the present position should be
- remembered. An "r" in the first column restores the position to
- where it was when the last "m" was encountered. This allows easy
- entry of variant lines from the same starting position. Only one
- remembered position can be stored and retrieved, however: e.g. if you
- specify "m" twice, the first position stored is discarded.
-
- Building the opening book
-
- To build the binary opening files from BOOK.TXT, you first need to
- build the program MAKEBOOK.EXE. To do this, type "make -f
- makebook.mak". MAKEBOOK is a DOS program that uses some of the same
- files as the chess program. Since it must be built for DOS and not
- Windows, make sure you remove any old object files before building
- it, so that you don't get Windows objects mixed in with the DOS
- objects.
-
- Once MAKEBOOK has been built, just execute it and it will read BOOK.TXT
- and produce BOOKB and BOOKW. Error messages are issued if any moves
- are illegal or if there are syntax errors in the input file.
-
- Testing support
-
- Arasan includes several features to aid debugging. If you compile the
- source with -DRANGE_CHECK, checks are inserted for accessing arrays
- past their boundaries, as well as some other sanity checks. If any
- of these checks fail, an error box will be displayed with the assertion
- that failed.
-
- For debugging purposes, it is also possible to build a DOS executable
- that contains the Arasan search engine, but no user interface. This
- executable is called "TESTSRC". To build TESTSRC, do "make -f testsrc.mak".
- Since testsrc is a DOS executable but uses many of the same files as the
- Windows executable, be sure to delete any old object files before building
- TESTSRC. "make clean" will do this.
-
- TESTSRC takes a command line that consists of one of two optional
- switches followed by an optional filename. The two allowable
- switches are:
-
- -p <number> - performs a fixed-ply search to the indicated depth.
- -t <number> - searches for the given number of seconds.
-
- The filename, if specified, contains one or more board positions to be
- analyzed, one per line. The position info is in EPD (Extended
- Position Description) format. TESTSRC currently only recognizes a
- small subset of EPD, sufficient to run the tests.
-
- If the search module is compiled with -DTRACE, TESTSRC will print out
- copious information about the search process when it is run. See
- search.cpp to see what information is output and where it comes from.
- Another compile-time debugging option is -DDEBUG_ATTACKS, which will
- insert test code to check that the attack information for the board
- is updated correctly when a move is made.
-
- Because TESTSRC is a DOS program, it can easily be run from a batch
- file to execute a series of tests. The following test suites are
- provided with the source code (see the TESTS directory):
-
- 1. TESTS.EPD contains 22 fairly simple tactical problems. Arasan
- should solve all these problems with a nominal 3-ply search.
-
- 2. WAC.EPD contains the 300 problems from Reinfeld's "Win At Chess"
- book. These are mostly easy tactical problems. Arasan solves about
- 2/3 of them with a 60-second search limit on a Pentium 60.
-
- 3. The file BK.EPD contain the Bratko-Kopec series of tests. It is
- standard procedure to allow the program 2 minutes for each test.
- Some of these problems are quite difficult. Arasan currently solves
- 11 out of the 24 problems when run on a 60MHz Pentium.
-
- 4. The file ECE3100.EPD contains the first 100 problems from
- Encyclopedia of Chess Endgames, volume 3. These are rook endgames.
-
- 5. Another endgame test suite is FINE.EPD, containing 48 king and
- pawn endgame problems from Reuben Fine's Basic Chess Endings.
-
- 6. TYPP.EPD contains tests from Bellin and Ponzetto's book Test
- Your Positional Play.
-
- 7. ECM.EPD contains test positions from the Encyclopedia of Chess
- Middlegames. These are difficult middlegame problems.
-
- The RESULTS file in the TESTS directory summarizes Arasan's performance
- on these test suites. Details of the test runs can be found in the log
- files, e.g. TESTS.LOG, BK.LOG, etc., which are also in the TESTS
- directory.
-
-
- Algorithms and data structures
-
- 1. The chess board
-
- Following is some information about the algorithms and data
- structures used by Arasan. If you are new to computer chess
- programming, I suggest first reading a general work on the
- subject such as Frey (1983) or Marsland and Schaeffer (1990).
-
- The chess board in Arasan is represented by an array of 64 squares.
- Each square contains 0 if it is empty, or a piece identifier if it is
- occupied. Black pieces have identifier values between 1 and 6, while
- White pieces have values between 9 and 15. (Note: the program used
- to use a variant of the "mailbox" representation described by Frey,
- with 120 squares. This allows easy detection of when a piece has
- reached the board boundary. However, because stack space is a
- limited resource in Windows, the representation was changed to use
- only 64 squares). A special value (127) is used to represent a
- square that is uninitialized or invalid.
-
- The Board class also maintains a parallel board representation called
- "PiecePos" consisting of two arrays of 16 entries each. These contain
- lists of the squares occupied by pieces of each side. Any array
- entry unoccupied by a piece is set to 127.
-
- Several other pieces of information are stored in the Board class and
- are updated for each move. The KingPos array holds the location of
- each side's king. The PFileCount array holds a count of the number
- of pawns on each file, for each side, and the RFileCount array holds
- a similar value for rooks. The EnPassantSq array holds the square
- position, for each side, on which an en passant capture is possible
- (obviously, only the side to move can have a possible en passant
- capture, but we maintain two values for programming convenience).
- The CastleStatus array holds an enum for each side indicating whether
- castling has occurred. Also, if the king or a rook has been moved,
- making castling on one side or another impossible, CastleStatus is
- set to an appropriate value.
-
- Each board position also has a hash code associated with it. The
- hash code is 32 bits and is computed by fetching, for each piece and
- square combination, a unique 32-bit code from a table of random
- numbers, and computing the exclusive or of these codes. The
- low-order bit of the hash code is then set to identify whether White
- or Black is to move. Castling status and en passant status are not
- folded into the hash code, but are stored in a separate field in a
- hash table entry - they must match the current board in order
- for a position to be retrieved from the table.
-
- Finally, the board position includes an array, for each side, of type
- Attacks. This contains an integer for each square indicating which
- pieces and of what kinds attack the square. This information is
- incrementally updated (by the Attacks class) when a move is made or
- unmade. Attacks does not store information about "stacked"
- attackers, which may make some calculations involving this
- information inaccurate.
-
- An earlier chess program I wrote re-calculated complete attack
- information for each position that was evaluated, but it wound up
- spending a large fraction of its time in the attack calculation
- procedure. Incrementally updating the attack information is
- relatively cheap and probably faster.
-
- 2. Moves
-
- There are three move classes used in Arasan. "Move" maintains only
- the minimal information need to unambiguously specify a move: start
- square, destination square, and promotion value. "ExtendedMove"
- contains also the piece being moved, the piece being captured (if
- any), and the type of move (normal, castling, en passant, etc).
- Finally, "ReversibleMove" contains in addition the castling and en
- passant status of the board before the move was made. Since this
- information is needed for restoring the board position, only a
- ReversibleMove can be "undone".
-
- 3. Move Generator
-
- The move generation logic is mostly contained in the classes Bearing,
- Move_Generator and Move_Ordering. Bearing contains static functions
- to compute squares to which a piece can move. Except for pawn moves,
- and for castling, move generation is done by table lookup. Each
- class of piece has an associated table. Each table is indexed by
- square number and side to move, and contains a list of squares to
- which the piece could move (these tables are in beardata.h).
- Additional checks are made to see that the destination square doesn't
- contain a piece of the same color. However, moves into check are
- possibly included.
-
- The Move_Generation class uses functions from Bearing to generate all
- moves for a given side. Moves into check are included, unless the
- side to move is in check. In that case, a special function
- (Check_Evasions) is called that strictly checks moves for equality.
- It is very important to know whether any legal moves are possible
- when in check: if there are none, the side to move is checkmated.
- Also, some search extensions depend on the existence of a forced move
- (one single legal move).
-
- Move_Generation also calls the Move_Ordering class to re-order the
- moves. At ply 0, some rather elaborate criteria are used for move
- ordering: this includes a computation of each moves' positional
- score, and also (for captures), a call to function attack_estimate,
- which uses the attack information for the destination square to
- estimate the gain from a capture. At ply 0, all moves are scored and
- sorted. At higher plies, a less elaborate algorithm is used, which
- moves the first few most promising moves to the start of the move
- array, and then sorts only those. Captures are given a high priority
- in the search order - higher if the capture appears to gain material,
- but fairly high even if the capture apparently loses. Promotions are
- also given special treatment.
-
- Arasan also uses the "killer" heuristic. Moves that cause beta
- cutoff in the search are stored in a killer array, and the move
- ordering routine will give these moves a higher score than other
- moves. But captures are generally given a higher score than "killer"
- moves, so the killer moves have a relatively small effect on the
- overall move ordering. The scoring algorithms in Move_Ordering have
- been tuned to maximize search speed, mostly on the tactical problems
- in the test suite, but there is doubtless more room for improvement.
-
- 4. Searching
-
- Arasan uses an alpha-beta search algorithm with a variety of search
- extensions. The search class is the largest single module in the
- program, and is necessarily rather complicated, but I have tried to
- structure it and comment it so that it is understandable. I will
- assume that the reader knows the basics of the alpha-beta algorithm,
- and will concentrate on describing this implementation of it.
-
- In general, the search routine tries to terminate a search tree, or
- some portion of one, as soon as possible, and will defer as much work
- as possible until it is certain that no early and quicker termination
- can be done. The techniques for doing this are mostly well-known and
- there is nothing very original about the search algorithms used by
- Arasan. However, as with most chess programs, there is a fine
- balance between terminating a search too soon and extending it into
- unprofitable and very unlikely lines of play. The precise nature of
- this balance depends not only on the search algorithms used, but also
- the relative efficiency of operations such as move generation,
- position evaluation and move ordering. Each program therefore
- strikes this balance in a somewhat different way.
-
- The entry point for a search is a routine called find_best_move.
- This function does some initialization, and then calls move_search,
- which implements the alpha-beta search algorithm. The search
- proceeds one ply (half move, i.e. move by one side) at a time. That
- is, first a one-ply search is done, then a two-ply search, then
- three, etc. until either the maximum ply limit has been reached or
- the time control has been exceeded. Each search uses the results of
- the preceeding search. The variable "max_depth" holds the current
- nominal ply depth for the search. However, the presence of search
- extensions means that some nodes may be searched to a greater depth
- than this.
-
- The first step in move_search is to look in the hash table (further
- described in the next section), in order to see if an identical
- position has been visited before. This may happen due to a
- transposition of moves that lead to the same position, or because a
- previous search to a shallower depth visited the same node. If a
- hash table entry is found and if it contains a valid value (i.e. one
- that did not cause cutoff), then that value is returned immediately
- and no further searching from that node occurs. In other cases, the
- hash table may not contain an exact value, but may hold an upper or
- lower bound that can be used to narrow the alpha-beta window.
-
- After the hash table check, a further check is made to see if the current
- board position is drawn, due to insufficient material or to a 3-fold
- repetition of moves. Arasan does not currently check for draws due to
- the 50-move rule or variants of it.
-
- If the position is drawn, move_search returns the negative of the
- evaluation function. This penalizes the program for allowing a draw
- when it is ahead in material or has a superior position.
-
- If the search has reached the nominal search depth plus the maximum
- possible search extension depth, or exceeded the maximum supported
- ply depth, it also terminates immediately.
-
- Normally the initial score for the position is set to -Constants::BIG
- (BIG is a large integer used several places in the search process).
- However, a different approach is taken when pursuing search
- extensions, i.e. portions of the search tree beyond the nominal
- search depth (max_depth). Three different types of search extension
- are used in Arasan:
-
- a. If the side to move is not in check, and if the search exceeds the
- maximum depth, searching continues but includes only promotions and
- capture moves that appear to gain material. There is a limit, set in
- the Extensions structure, to how many additional plies may be
- searched.
-
- b. If the side to move is in check, the search is also extended, and
- includes all legal replies, again up to the limit set in Extensions.
-
- c. Finally, forced moves (moves with only one possible reply) cause
- the search to be extended one ply and this ply is not counted towards
- the computation of any other search limit.
-
- In case a., which in the source code is called a "capture search,"
- the current position is evaluated and searching may terminate if the
- resulting value causes beta cutoff. The reasoning is that there is
- no point continuing the search if the side to move can "stand pat"
- without making any furture captures, and still obtain a value high
- enough for cutoff. However, this early cutoff is inhibited if the
- side to move has an apparently trapped piece, or appears to have
- a "hung" piece (a piece subject to profitable capture).
-
- If early cutoff doesn't occur, the next step is to try the "null
- move.". The side to move is changed without altering the board
- position and the opposing side is then allowed to move. Of course,
- this could not occur in a game - a player is not allowed to "pass,"
- but must move. However, the theory is that if the null move causes
- cutoff, then the side to move must have a good position, since in
- effect giving the opponent a free move still produces a high value
- for the side to move. In this case, beta cutoff is allowed to occur
- and no more searching is done from this node. This search
- enhancement is enabled by compiling with -DNULL_MOVE.
-
- If the null move does not cause cutoff, then the principal variation
- move is searched. In the case of an initial search (e.g. a one-ply
- search), the principal variation move is just the first move returned
- by the move generator. Otherwise, at ply 0, it is the
- highest-scoring move from the previous search iteration. This is
- obtained by sorting the 0-ply moves from the last iteration, all of
- which were stored along with their scores. At deeper plies, the hash
- table is queried and if a best move has been stored for the position,
- that move is tried first and is considered the principal variation.
-
- The search first calls skip_move to see if the move should be
- skipped. skip_move enforces the search extension limits mentioned
- above. If the move is not to be skipped, skip_move returns "False",
- and the search proceeds to call try_move. try_move will first make
- the move, then query the attack info for the board to see if the side
- to move is in check (remember, the move generator typically does not
- exclude moves into check). If a move into check is found, the
- special value Illegal is returned and the next move is tried. If the
- move passes the legality check, then move_search is called again (it
- is thus indirectly recursive).
-
- If the return value from move_search did not cause cutoff, then the
- search must be peformed again with a wider search window. try_move
- takes care of this. try_move also checks the timer (if a timed search
- is being done) and determines when to terminate the search. Usually
- this occurs when 95% of the allotted time has been used, but in some
- cases the computer will be allowed to search a slightly longer time.
-
- Once try_move returns a value for the node, it is compared against
- alpha and beta. If the value exceeds alpha, then a new best move has
- been found at this node and the move is stored. If the value exceeds
- beta, cutoff occurs and no more searching is done. A special check
- is also made to see if the value indicates that mate has occurred,
- since there is no point searching any further after that.
-
- Assuming the principal variation move does not cause cutoff or mate,
- then move_search proceeds to search the remainder of the moves.
- These are searched with a minimal alpha-beta window. Note that since
- the principal variation move is usually obtained from the hash table
- or the ply0moves array, it may be the case that the move generator
- has never had to be called before now. If so, we call it at this
- time. An earlier version of Arasan tried to optimize things further
- by only generating part of the moves at a time. That way, if cutoff
- occurred, the remainder of the move generation could be skipped.
- However, there was a cost to this in terms of complexity and it did
- not seem to help search speed much.
-
- The final part of move_search checks to see if checkmate or stalemate
- occurred, and updates the hash table.
-
- When the top-level invocation of move_search terminates, it returns
- to find_best_move, which determines the time used, and updates the
- "Statistics" structure with the time and other information about the
- search.
-
- 5. The hash table
-
- The search routine uses a hash table for storing the results of
- evaluating previously visited positions. This table is implemented
- using a template class (Hash) because the opening book generator uses
- a similar hash table but stores somewhat different information in the
- table. The hash table is basically an array of lists (class S_List).
- Each list contains a series of nodes (S_Node), each of which contains
- some data (in the case of the search engine, a class of type
- Position_Info) and a pointer to the next node. Each list holds
- entries that hash, modulo the hash table size, to the same value.
- Each node contains the whole hash code, so that finding a given node
- to match a given hash code consists of indexing into the hash table,
- then following the list until the hash codes match.
-
- Besides the hash code, each hash entry also contains the score for
- the node, a set of flags indicating whether the value is exact, an
- upper bound or a lower bound, the depth of search used to evaluate
- the node, a word holding the castling status and en passant square,
- and the best move for the position.
-
- Under Windows, the number of lists in the hash table is configured
- at runtime based on the amount of memory available. Under DOS,
- the number of lists is fixed in size. In both cases, there is also
- a limit on how many nodes can be entered into the table. The hash
- table is cleared after each search.
-
- Memory allocation and deallocation for the hash table can be quite
- expensive, especially under Windows. To minimize this, operators
- new and delete are overloaded for both S_List and S_Node classes.
- These operators use a simple memory allocation scheme that obtains
- memory from the OS in large chunks and does suballocation out of
- the chunks (class Pool implements this). Memory freed via "delete"
- goes into a free list and is immediately available for allocation
- again via "new". Memory is not actually freed until the "freeAll"
- method is called, which occurs only when the hash table is cleared.
- Then the large memory chunks are returned to the OS.
-
- 6. Position Scoring
-
- There are six main components to the positional score used by Arasan:
-
- a. Center control
- b. Development
- c. Castling
- d. Pawn structure
- e. King safety
- f. Threats
-
- The positional score is typically within the range of plus or minus
- the value of a pawn (64), but can be greater in some circumstances.
-
- When in the endgame (determined by the amount of material on the board)
- simplified and/or different scoring parameters are used. Also,
- special scoring code is used for "mopping up" positions, in which
- one side has a large material advantage. Chess 4.5 from Northwestern
- University used a similar scheme.
-
- Center control is mostly calculated by table lookup. For sliding
- pieces, a check is made to be sure that the piece has an unobstructed
- line of movement to the center of the board. This component of the
- score is not used in the endgame.
-
- The development score encourages the program to move its pieces from
- the back rank, but discourages premature development of the queen.
- A measure of piece mobility is also calculated for bishops and rooks.
- Bonuses are awarded for a rook on the 7th rank, and also for a rook
- on an open or half-open file, and for doubled rooks.
-
- In the endgame, the development score encourages the king to move
- towards the center or opposite side of the board, and to stay
- near pawns. A bonus is also awarded for having the opposition.
- This code is not very effective in producing good play, but it
- is better than nothing. Special-case code is included for
- KPK and KNBK endgames, which enables the program to play these
- fairly well.
-
- When "mopping up", the development score gives a bonus for restricting
- the opposing king's mobility and for driving the opposing king to
- an edge or corner of the board.
-
- The castling score penalizes the program for being unable to castle,
- either temporarily (because a square traversed by castling is under
- attack) or permanently (because the king or rook has made another move).
-
- The pawn structure score penalizes isolated, backward, and doubled
- pawns, and gives a bonus for passed pawns. If a passed pawn exists,
- its value depends on its rank and also on whether squares ahead of
- it are occupied or controlled by the opposing side.
-
- The king safety score evaluates the extent to which the king is
- protected by pawns, and also the extent to which opposing pieces
- attack squares near the king. It is pretty inexact, but does
- at least alert the side to move when the king is in serious trouble.
-
- The threat score penalizes the side to move for having pieces
- that appear to be trapped or to be subject to profitable capture
- ("en prise"). A pinned piece is given the same penalty as a
- trapped piece. Usually the search routine will extend the search
- when such situations exist, but we have to do something when the
- terminal search depth is reached.
-
- Contacting the Author
-
- I have been working on computer chess programming for around seven
- years, and this is the second complete chess program I have written.
- It is still very much imperfect - but I have decided to release it
- in its present form, both so that others can enjoy playing with it
- and so that programmers can study it, learn from it, and maybe
- improve it. I don't guarantee any support for this program. However,
- if you do find bugs in it, or discover a way to improve it, I would
- like to hear from you. I can be reached at:
-
- Jon Dart
- 553 N. 17th St.
- San Jose, CA 95112
-
- Or via Internet at jdart@rational.com.
-
- References
-
- Frey, Peter W. (ed.) (1983). Chess Skill in Man and Machine.
- New York: Springer-Verlag.
-
- Marsand, T. Anthony and Schaeffer, Jonathan (1990). Computers,
- Chess and Cognition. New York: Springer-Verlag.
-
-
-